Slide Deck

Multichannel TIFF Data

Here I briefly discuss loading and visualizing raw TIFF data. Typically I focus on data in tabular form post segmentation (not the raw images), but it’s helpful to see what they look like.

Lung cancer data

The following image is a single ROI from a tissue slice from a patient with non small cell lung carcinoma.Cell and tissue segmentation and cell phenotyping were performed using inForm software. For each image, there are multiple Tiff files output by inForm:

  • lung_component_data.tif: the multichannel file we are most interested in
  • lung_binary_seg_maps.tif: 3 channel file with binary segmentation maps
  • lung.tif: A composite image in RGB format.

# load relevant libraries
library(tidyverse)
library(tiff)
library(EBImage)


# define your specific file path
my_path = "./Data/"

# define file for different .tif files
img = paste0(my_path, "lung_component_data.tif")
segmentation = paste0(my_path, "lung_binary_seg_maps.tif")
composite_tif = paste0(my_path, "lung.tif")

Image stack file

The component.tif file contains the multichannel tiff, which is a stack of individual images (one for each marker). Below I define a function to read the image stack file and extract metadata.

# this function is from phenoptr package on GitHub
# extracts metadata specific to Vectra3/VectraPolaris images
read_image_stacks <- function(path) {
  stopifnot(file.exists(path), endsWith(path, 'component_data.tif'))

  tif = tiff::readTIFF(path, all=TRUE, info=TRUE)

  # Get the image descriptions and figure out which ones are components
  infos = purrr::map_chr(tif, ~attr(., 'description'))
  images = grepl('FullResolution', infos)

  # Get the image_stack names
  names = stringr::str_match(infos[images], '<Name>(.*)</Name>')[, 2]

  purrr::set_names(tif[images], names)
}

image_stack = read_image_stacks(img)

The image_stack object is now a list, where each list element is a different marker

map(image_stack, dim)
## $`CD19 (Opal 650)`
## [1] 1008 1348
## 
## $`CD3 (Opal 520)`
## [1] 1008 1348
## 
## $`CD14 (Opal 540)`
## [1] 1008 1348
## 
## $`CD8 (Opal 620)`
## [1] 1008 1348
## 
## $`HLADR (Opal 690)`
## [1] 1008 1348
## 
## $`CK (Opal 570)`
## [1] 1008 1348
## 
## $`DAPI (DAPI)`
## [1] 1008 1348
## 
## $Autofluorescence
## [1] 1008 1348

I can use the display() function from the EBImage package to plot an individual channel. Below the DAPI (nucleus) channel is plotted.

dapi_channel = t(image_stack$"DAPI (DAPI)")
EBImage::display(dapi_channel, method = "raster")

Now we plot the CD3 channel (a protein marker for T-cells).

tcell_channel = t(image_stack$"CD3 (Opal 520)")
EBImage::display(tcell_channel, method = "raster")

Segmentation files

This data was collected on a Vectra 3 instrument and segmented using inForm tissue analysis software from Akoya Biosciences. This is proprietary image analysis software that can perform cell and tissue segmentation and cell phenotyping. The segmentation below were performed by that software.

# source phenoptr code for loading file and metadata, based on tiff::readTIFF.
source("https://raw.githubusercontent.com/akoyabio/phenoptr/main/R/read_maps.R")

segmentations = read_maps(segmentation)
names(segmentations)
## [1] "Nucleus"  "Membrane" "Tissue"

Nucleus segmentation

nucleus = segmentations[['Membrane']]
plot(as.raster(nucleus, max = max(nucleus)))

Nucleus segmentation

nucleus = segmentations[['Nucleus']]
plot(as.raster(nucleus, max = max(nucleus)))

Composite file

# source phenoptr code for loading file and metadata, based on tiff::readTIFF.
source("https://raw.githubusercontent.com/akoyabio/phenoptr/main/R/read_composites.R")

composite = read_composites(composite_tif)
composite = as.raster(composite[[1]])
plot(composite)

rm(dapi_channel, image_stack, nucleus,segmentation, tissue, composite, dapi)
## Warning in rm(dapi_channel, image_stack, nucleus, segmentation, tissue, :
## object 'tissue' not found
## Warning in rm(dapi_channel, image_stack, nucleus, segmentation, tissue, :
## object 'dapi' not found

VectraPolarisData

The VectraPolarisData ExperimentHub package provides two large multiplex immunofluorescence datasets collected using Akoya Biosciences Vectra 3 and Vectra Polaris platforms. Image preprocessing (cell segmentation and phenotyping) was performed using inForm software. Data are provided as objects of class SpatialExperiment.

# load libraries
library(tidyverse)

Install and load data

The data is publicly available on Bioconductor as the package VectraPolarisData. You can install the package from Bioconductor here:

if (!requireNamespace("BiocManager", quietly = TRUE)) {
     install.packages("BiocManager")
}

BiocManager::install("VectraPolarisData")

Ovarian cancer data

Load the high grade serous ovarian cancer data. This dataset has been segmented and phenotyped, and has one image per patient. I will be working with a form of this data that has already been preprocessed into a tidy dataframe.

# load processed ovarian cancer data
load(url("https://github.com/julia-wrobel/MI_tutorial/raw/main/Data/ovarian.RDA"))

The resulting dataframe, ovarian_df, contains information on marker intensities, cell X and Y position, cell phenotypes, and patient-level characteristics.

ovarian_df %>%
  select(contains("id"),tissue_category, cd8, phenotype_cd68, x, y, everything()) %>%
  as_tibble() %>%
  head()
## # A tibble: 6 × 35
##   sample_id cell_id slide_id  tissue_category   cd8 phenotype_cd68      x      y
##       <dbl>   <dbl> <chr>     <chr>           <dbl> <chr>           <dbl>  <dbl>
## 1        68       1 030120 P… Stroma          0.251 ""             12185. 34524.
## 2        68       2 030120 P… Tumor           0.276 ""             12233  34524.
## 3        68       3 030120 P… Stroma          0.279 ""             12254. 34524.
## 4        68       4 030120 P… Stroma          0.48  "CD68-"        12031. 34525.
## 5        68       5 030120 P… Stroma          0.422 "CD68-"        11848. 34531.
## 6        68       6 030120 P… Stroma          0.279 "CD68-"        11875. 34535.
## # ℹ 27 more variables: primary <dbl>, recurrent <dbl>, treatment_effect <dbl>,
## #   stage <chr>, survival_time <dbl>, death <dbl>, BRCA_mutation <dbl>,
## #   age_at_diagnosis <dbl>, time_to_recurrence <chr>, parpi_inhibitor <chr>,
## #   debulking <chr>, stage_bin <dbl>, phenotype_ki67 <chr>, phenotype_ck <chr>,
## #   phenotype_cd19 <chr>, phenotype_p_stat3 <chr>, phenotype_cd3 <chr>,
## #   phenotype_cd8 <chr>, ck <dbl>, ki67 <dbl>, ier3 <dbl>, pstat3 <dbl>,
## #   cd3 <dbl>, cd68 <dbl>, cd19 <dbl>, dapi <dbl>, immune <chr>

Below we plot the cells for a single subject. In this dataset there is one image for each subject, and the image comes from a tumor microarray (TMA).


set.seed(7332)
id = sample(ovarian_df$sample_id, 1)

ovarian_df %>%
  filter(sample_id == id) %>%
  ggplot(aes(x, y)) +
  geom_point(aes(color = tissue_category), size = 0.2)

ovarian_df %>%
  filter(sample_id == id) %>%
  mutate(phenotype_cd68 = ifelse(phenotype_cd68 == "CD68+", "CD68+", "CD68-")) %>%
  ggplot(aes(x, y)) +
  geom_point(aes(color = phenotype_cd68), alpha = 0.8, size = 0.7)

Lung data

Load the non small cell lung carcinoma data and inspect the SpatialExperiment object. This dataset has 3-4 images per patient, representing different regions of interest (ROIs) from a tissue slice. In this dataset, each patient was imaged on a separate slide.

# load processed ovarian cancer data
load(url("https://github.com/julia-wrobel/MI_tutorial/raw/main/Data/lung.RDA"))
set.seed(23423)
id = sample(lung_df$patient_id, 1)

lung_df %>%
  filter(patient_id == id) %>%
  ggplot(aes(x, y)) +
  geom_point(aes(color = tissue_category), size = 0.3) +
  facet_wrap(~image_id)

Spatial Analysis using spatstat package

The spatstat package in R has great resources for analyzing spatial point patterns. Let’s use the image above to do some exploratory analysis with spatstat.

We will do spatial analysis on the distribution of B-cells and macrophages, so we subset to only these cell types, and create a phenotype variable that designates whether a cell is a B cell or a macrophage. Then we create a ppp object, which is what the spatstat package uses for storage and analysis of point pattern data.

subj_df = filter(ovarian_df, sample_id == 7) 

library(spatstat)

subj_df = subj_df %>%
  # create a phenotype variable that is B-cell, macrophage, or other
  mutate(phenotype = case_when(
    phenotype_cd68 == "CD68+" ~ "macrophage",
    phenotype_cd19 == "CD19+" ~ "B-cell",
    TRUE ~ "other"
  )) %>%
  filter(phenotype != "other") %>%
  select(x,y, immune, tissue_category, phenotype)

# first define window of observation for your image
w = convexhull.xy(subj_df$x,subj_df$y)

# create ppp object as multitype point pattern
# need to factor the marks variable for ppp object to be treated as a multitype point pattern
ovarian_pp = ppp(subj_df$x,subj_df$y, window = w,
                 marks = factor(subj_df$phenotype))

Univariate Ripley’s K

First we calculate Ripley’s K for B cells.

k_bcell =  Kest(subset(ovarian_pp, marks == "B-cell"), correction = "isotropic")

We can plot this k_bcell object using spatstat to visualize the estimated K (\(\hat{K}^{iso}\)) compared to the K under CSR (\(K^{pois}\)).

plot(k_bcell)

Now visualize the K-function for macrophages in this image:

k_mac =  Kest(subset(ovarian_pp, marks == "macrophage"), 
              correction = "isotropic")
plot(k_mac)

It appears that there is some degree of clustering for both B-cells and macrophages in this image. Are these statistically significant?

We can test this using the envelope function, which performs Monte Carlo simulations of the K function under CSR. Plot below shows pointwise envelope for B-cell K-function. Pointwise means that these are performed separately for each value of r. Can only be interpreted if a specific value of r is chosen in advance.


e_test = envelope(subset(ovarian_pp, marks == "B-cell"), 
               Kest, correction = "isotropic", global = TRUE)
## Generating 99 simulations of CSR  ...
## 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
## 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
## 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60,
## 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80,
## 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 
## 99.
## 
## Done.

plot(e_test)

Points outside the envelope indicate significant departure from CSR. Note that this could be due to inhomogeneity in the distribution of cells, or due to clustering. See spatstat::Kinhom function for a test that takes into account inhomogeneity. See spatstat::Lest function to compute the L-function.

Bivariate Ripley’s K

The bivariate K function measures whether the spatial distribution of 2 cell types is independent. Deviation from the \(K^{pois}\) line suggests a spatial relationship between the B-cells and macrophages in the image.

k_biv = Kcross(ovarian_pp, "B-cell", "macrophage", correction = "isotropic")

plot(k_biv)